ASAWA: An Automatic Partition Key Selection Strategy
Abstract
With the rapid increase in data volume, more and more applications have to be implemented in a distributed environment. To obtain high performance, the whole dataset must be carefully divided into multiple partitions that are placed on distributed data nodes. During this process, the selection of the partition key greatly affects overall performance. Nevertheless, few works address this topic. Most previous projects on data partitioning either use a simple strategy or rely on a commercial database system to choose partition keys. In this work, we present an automatic partition key selection strategy called ASAWA. It chooses partition keys by analyzing both the dataset and workload schemas. In this way, intimate tuples, i.e., tuples that frequently co-appear in queries, are likely to be placed in the same partition. Hence cross-node joins can be greatly reduced and system performance improved. We conduct a series of experiments over the TPC-H datasets to illustrate the effectiveness of the ASAWA strategy.
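The general idea of workload-driven key selection can be illustrated with a minimal sketch. This is not the ASAWA algorithm, whose details the abstract does not give; it is a simple frequency heuristic that assumes the workload has already been summarized as join pairs with counts. The function names `choose_partition_keys` and `node_for`, the input format, and the frequencies are all hypothetical; only the TPC-H table and column names are real.

```python
from collections import Counter, defaultdict


def choose_partition_keys(join_workload):
    """Pick, for each table, the column that appears most often in joins.

    join_workload: iterable of ((table_a, col_a), (table_b, col_b), frequency).
    This is an illustrative heuristic, not the ASAWA strategy itself.
    """
    usage = defaultdict(Counter)
    for (table_a, col_a), (table_b, col_b), freq in join_workload:
        usage[table_a][col_a] += freq
        usage[table_b][col_b] += freq
    # Iterating over sorted column names makes ties resolve alphabetically.
    return {
        table: max(sorted(cols), key=cols.__getitem__)
        for table, cols in usage.items()
    }


def node_for(key_value, num_nodes):
    """Map an integer key value to a data node (simple modulo partitioning)."""
    return key_value % num_nodes


# Hypothetical workload summary: join pairs with made-up frequencies.
workload = [
    (("orders", "o_orderkey"), ("lineitem", "l_orderkey"), 50),
    (("orders", "o_custkey"), ("customer", "c_custkey"), 20),
    (("lineitem", "l_partkey"), ("part", "p_partkey"), 10),
]

keys = choose_partition_keys(workload)
print(keys)
```

With these made-up frequencies, `orders` is partitioned on `o_orderkey` and `lineitem` on `l_orderkey`, so an order and its line items share a key value and `node_for` sends them to the same node. That join then needs no cross-node traffic, while the less frequent join between `orders` and `customer` may still cross nodes.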